Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disabled managed identity on azureml compute instance #47

Merged
merged 1 commit into from
Feb 17, 2023

Conversation

j0rd1smit
Copy link
Contributor

This fixes issue #45

Additional explanation

Ok, I did a deep dive into the problem, and I found the issue. AzureML compute instance set by default the MSI_ENDPOINT and MSI_SECRET environment variables for on compute instances. Even if this managed identity has no rights. The problem is that the DefaultAzureCredential will prioritize managed identities over CLI logins. This would be fine if DefaultAzureCredential would validate if it selected strategy would work. Instead, it greedily picks the first one that could work, and if it does not work, it does not try the others. So, in this case, it finds the MSI_ENDPOINT env variable, so it decides to go for the AzureMLCredential when this one fails to obtain to get the token, it no longer checks the other options. So in this code:

 try:
        credential = DefaultAzureCredential()
        # Check if given credential can get token successfully.
        credential.get_token("https://management.azure.com/.default")
    except Exception:
        # Fall back to InteractiveBrowserCredential in case DefaultAzureCredential not work
        credential = InteractiveBrowserCredential()

We will always fall back to the InteractiveBrowserCredential.
Sadly, the InteractiveBrowserCredential also does not work since azure compute instances are headless.
We could replace InteractiveBrowserCredential with DeviceCodeCredential. This works, but an annoying side effect is that we needed re-login every time we submit a job.

So, I believe the best course of account would be to check if you are on an AzureML compute instance, and if this is the case, tell DefaultAzureCredential to ignore the managed identity credential. (ManagedIdentityCredential does this by checking for the presence of the MSI_ENDPOINT environment variable`).

@marrrcin
Copy link
Contributor

Thanks for your research and contribution @j0rd1smit ! We will accept your PR, but first please rebase your branch to the latest develop, as we've applied a fix in the pre-commit configuration which makes the pre-commit/unit tests pass. Without this fix they will fail.

@j0rd1smit
Copy link
Contributor Author

If everything went correctly, I rebased it. If something when wrong, let me know. (first time rebase)

@marrrcin
Copy link
Contributor

I'm not sure that you've rebased correctly because now your PR shows changes that were already merged to our develop.

Please do the following:

  1. Rollback what you did with the rebase recently (git reset --hard f7b2e47c2a37898b666362ee19d8c4f1f6b814a3 on your branch).
  2. Fetch changes from our repo (either git fetch origin develop or git fetch <name of the remote of GetInData's GitHub> develop.
  3. Just do git rebase origin/develop or git rebase <name of the remote of GetInData's GitHub>/develop
    After those operations you commit history should look like you've added your change after the changes made on our develop branch.
  4. Push changes to your fork (most likely you will have to use --force with git push command)

@j0rd1smit j0rd1smit force-pushed the fix/compute_instance branch from 9c58ded to 7e1ac39 Compare February 17, 2023 14:02
@marrrcin marrrcin merged commit c70cb18 into getindata:develop Feb 17, 2023
@marrrcin
Copy link
Contributor

I've merged it, but please wait for feedback, as we only can run e2e tests on our branches right now.

@marrrcin
Copy link
Contributor

It seems to work fine in e2e tests. We will release it on Monday then!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants